threshold state
We define a Gibbs distribution associated with the problem and evaluate its normalization constant, the partition function Z. The partition function, and in particular its logarithm divided by the inverse temperature (up to a sign), the free energy, contains all the information we aim to extract from the problem. By taking the proper derivatives, and possibly adding an external field, we can compute relevant macroscopic properties such as the average overlap with the ground truth and the average loss achieved. In disordered systems we have to deal with the additional complication introduced by the randomness, which requires averaging the log-partition function over the disorder.
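As a schematic rendering of these definitions (our notation and normalizations, stated as assumptions: $\beta$ is the inverse temperature, $\mathcal{L}(w)$ the loss, $m(w, w^\star)$ the overlap with the ground truth, $h$ an external field coupled to it, and $\mathbb{E}$ the quenched average over the randomness):

```latex
% Partition function with an external field h coupled to the overlap,
% and the quenched free energy:
Z(\beta, h) = \int \mathrm{d}w \, e^{-\beta \left[ \mathcal{L}(w) - h N m(w, w^\star) \right]},
\qquad
f(\beta, h) = -\lim_{N \to \infty} \frac{1}{\beta N} \, \mathbb{E} \log Z(\beta, h).

% The macroscopic observables follow by differentiation:
\frac{\langle \mathcal{L} \rangle}{N} = \frac{\partial (\beta f)}{\partial \beta} \bigg|_{h = 0},
\qquad
\langle m \rangle = - \frac{\partial f}{\partial h} \bigg|_{h \to 0}.
```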
- Europe > France (0.05)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Italy (0.04)
- Europe > Switzerland (0.04)
- North America > Canada (0.04)
- Europe > Italy > Lazio > Rome (0.04)
- Europe > France > Île-de-France > Paris > Paris (0.04)
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
We kindly thank the reviewers (R1, …). We discuss the main points below.
- Why is the Hessian in Figure 1 not marginal before the transition? […]
- We will add to the supplementary material a version of Fig. 2 for a value of […].
- We will add clarifications in the text wherever it could give rise to confusion.
- Above the threshold this algorithm can find the solution.
- Why is the threshold energy higher for more samples? […]
Review for NeurIPS paper: Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
Weaknesses: One of the main assumptions used in characterizing the "threshold states" at which the gradient-flow dynamics appear to get trapped is that the Hessian is positive semidefinite. On the other hand, in Figure 2, as the training loss crosses the threshold energy, the minimal eigenvalues of the Hessian appear clearly negative, unless I am misunderstanding the figure. The authors do not appear to address this point. Could the questionable soundness of this assumption also account for the inaccuracy of the computed value of alpha at which the phase transition occurs, visible in Figure 4? The main result of the paper, the computation of the relative sample size alpha at which the phase transition occurs, does not seem very accurate when compared to the experiments in Figures 3 and 5. It would also have been helpful to plot this value in the figures to make this point clear. The discrepancy could be a result of finite-size effects, as the authors claim, but it could also stem from the assumption made about the Hessian at the threshold states or from the accuracy of the 1RSB ansatz.
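A simple way to probe this point numerically is to evaluate the smallest Hessian eigenvalue of the loss along the trajectory. The sketch below is our own illustrative construction, not the authors' code, and assumes the loss convention L(w) = (1/4M) * sum_m ((x_m . w)^2 - y_m)^2:

```python
import numpy as np

# Illustrative probe (assumed convention, not the paper's code):
# for L(w) = (1/4M) * sum_m ((x_m . w)^2 - y_m)^2, the Hessian is
# Hess L(w) = (1/M) * sum_m (3 (x_m . w)^2 - y_m) x_m x_m^T.

def min_hessian_eigenvalue(X, y, w):
    """Smallest Hessian eigenvalue at w; a negative value means the
    Hessian is not positive semidefinite there."""
    z = X @ w
    H = (X.T * (3.0 * z**2 - y)) @ X / X.shape[0]  # (1/M) sum_m (3 z_m^2 - y_m) x_m x_m^T
    return np.linalg.eigvalsh(H)[0]                # eigenvalues sorted ascending

# Toy usage at a random configuration (alpha = M/N = 3; all sizes illustrative):
rng = np.random.default_rng(0)
N, M = 128, 384
X = rng.standard_normal((M, N))               # Gaussian sensing vectors
w_star = rng.standard_normal(N)
w_star /= np.linalg.norm(w_star)              # unit-norm ground truth
y = (X @ w_star) ** 2                         # noiseless quadratic measurements
w = rng.standard_normal(N)
w /= np.linalg.norm(w)                        # random unit-norm configuration
print(min_hessian_eigenvalue(X, y, w))
```

Tracking this quantity as the training loss crosses the threshold energy would directly test whether the positive-semidefiniteness assumption holds there.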
Review for NeurIPS paper: Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
The paper makes interesting contributions towards understanding non-convex optimization by studying a problem that is simple enough to allow for analytical calculations. Overall, there is a decent, well-supported agreement between theory and experiment (in particular, between the leading moments of the distribution of the threshold states as evaluated empirically and the computed moments). This paper is a valuable contribution to NeurIPS and should be accepted. We nevertheless recommend several directions along which the paper could be further improved to reach a wider audience, and we recommend that the authors revisit their author feedback before submitting the final version. First, the presentation is unusually difficult to follow from the perspective of the machine learning audience and could be improved by providing more background on the known results used in the paper (e.g., the BBP transition or replica theory), if necessary in the appendix.
From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima
Tony Bonnaire, Giulio Biroli, Chiara Cammarota
We investigate the optimization dynamics of gradient descent in a non-convex and high-dimensional setting, with a focus on the phase retrieval problem as a case study for complex loss landscapes. We first study the high-dimensional limit where both the number $M$ and the dimension $N$ of the data go to infinity at fixed signal-to-noise ratio $\alpha = M/N$. By analyzing how the local curvature changes during optimization, we uncover that for intermediate $\alpha$, the Hessian displays a downward direction pointing towards good minima in the first regime of the descent, before the dynamics get trapped in bad minima at the end. Hence, the local landscape is benign and informative at first, before gradient descent brings the system into an uninformative maze. The transition between the two regimes is associated with a BBP-type threshold in the time-dependent Hessian. Through both theoretical analysis and numerical experiments, we show that in practical cases, i.e. for finite but even very large $N$, successful optimization via gradient descent in phase retrieval is achieved by falling towards the good minima before reaching the bad ones. This mechanism explains why successful recovery is obtained well before the algorithmic transition corresponding to the high-dimensional limit. Technically, this is associated with strong logarithmic corrections of the algorithmic transition at large $N$ with respect to the one expected in the $N\to\infty$ limit. Our analysis sheds light on this new mechanism facilitating gradient descent dynamics in finite large dimensions, and highlights the importance of initial conditions with good spectral properties for optimization in complex high-dimensional landscapes.
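For concreteness, here is a minimal sketch of gradient descent on a phase-retrieval loss of this type; the loss convention, step size, and sizes are our illustrative assumptions (the paper itself analyzes gradient flow in the $N \to \infty$ limit):

```python
import numpy as np

# Sketch under assumed conventions (not the authors' code):
# L(w) = (1/4M) * sum_m ((x_m . w)^2 - y_m)^2 with y_m = (x_m . w*)^2,
# Gaussian sensing vectors, plain gradient descent with a fixed step size.

rng = np.random.default_rng(1)
N = 256
alpha = 3.0                                    # sample ratio alpha = M / N
M = int(alpha * N)
lr, n_steps = 0.05, 3000

X = rng.standard_normal((M, N))                # rows are the sensing vectors x_m
w_star = rng.standard_normal(N)
w_star /= np.linalg.norm(w_star)               # unit-norm ground truth
y = (X @ w_star) ** 2                          # noiseless quadratic measurements

w = rng.standard_normal(N)
w /= np.linalg.norm(w)                         # random initialization

for _ in range(n_steps):
    z = X @ w
    w -= lr * (X.T @ ((z**2 - y) * z) / M)     # gradient of L(w)

z = X @ w
loss = np.sum((z**2 - y) ** 2) / (4 * M)
overlap = abs(w @ w_star) / np.linalg.norm(w)  # |m| in [0, 1]; the sign of w* is unrecoverable
print(f"final loss = {loss:.4f}, overlap |m| = {overlap:.3f}")
```

Depending on $\alpha$, $N$, and the initialization, the overlap either stays near zero (trapping in bad minima) or approaches one (recovery), mirroring the two regimes described in the abstract.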
- North America > United States (0.14)
- Africa > Middle East > Tunisia > Ben Arous Governorate > Ben Arous (0.04)
- Europe > Russia (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.97)
Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová
In many machine learning applications one optimizes a non-convex loss function; this is often achieved using simple descent algorithms such as gradient descent or its stochastic variants. The positive results obtained in practice are often hard to justify from the theoretical point of view, and this apparent contradiction between non-convex landscapes and the good performance of simple algorithms is a recurrent problem in machine learning. A successful line of research has studied the geometrical properties of the loss landscape, distinguishing between good minima, which lead to good generalization error, and spurious minima, which are associated with bad generalization error. The results showed that in some regimes, for several problems from matrix completion [1] to wide neural networks [2, 3], spurious minima disappear, and consequently, under weak assumptions [4], gradient descent converges to good minima. However, these results do not explain the numerous observations in which good and spurious minima coexist, yet gradient descent systematically works [5, 6]. In [7] it was theoretically shown that in a toy model, the spiked matrix-tensor model, it is possible to find good minima with high probability in a regime where exponentially many spurious minima are provably present. In [8] it was shown that this is due to the presence of the so-called threshold states in the landscape, which play a key role in the dynamics of gradient flow [9, 10]: at first attracting it, and subsequently triggering the convergence towards lower minima under certain conditions [11, 12]. However, the spiked matrix-tensor model is an unsupervised learning model, and it remained open whether the picture put forward in [7, 8] also holds for learning with neural networks.
- Asia > Middle East > Jordan (0.04)